Cooperative Distributed Uplink Cache over B5G small cell networks

The emergence of content-centric networks has resulted in a substantial increase in data transmission in both the uplink and downlink directions. To tackle the ensuing network congestion and backhaul bottlenecks in Beyond Fifth Generation (B5G) networks, data caching has emerged as a popular solution. However, caching for uplink transmission in a distributed B5G scenario poses several challenges, including duplicate content matching and users' unawareness of cached contents. Furthermore, it is important to make the best use of the available space by caching the most popular contents in a distributed manner. In this paper, we propose two schemes for uplink transmission in distributed B5G small cell networks (SCNs). The first scheme focuses on content matching to eliminate duplicate contents among distributed caches, while the second scheme redistributes un-duplicated cached contents among distributed caches based on their available space and the contents' sizes. These approaches aim to enhance energy and spectral efficiency by reducing unnecessary uploads and optimizing distributed content caching, in addition to improving content delivery. The analysis shows that the proposed schemes outperform the existing schemes by improving the cache hit ratio, cache hit probability, overall distributed cache efficiency, and diversity by 29.17%, 74.89%, 24.17%, and 80%, respectively. Furthermore, the average throughput, Spectrum Efficiency (SE), and Energy Efficiency (EE) of the access network are improved by 17.78%, 18%, and 78%, respectively. In addition, the EE and SE of both the sidehaul and backhaul links of the Small Base Stations (SBSs) are improved.


Introduction
In recent years, there has been a significant increase in data production and demand from end-users, leading to exponential growth in data streaming. It is expected that the number of internet users will reach 5.3 billion by 2023 [1]. This growth has profound implications for the capacity of Beyond 5th Generation (B5G) networks, raising challenges in traffic load, congestion, latency, content popularity, energy consumption, and spectrum usage, as well as storage and content delivery [2][3][4][5][6].
To address these challenges, small cell technology is being widely advocated in 5G and B5G networks. However, implementing small cells introduces additional complexities, such as user data requirements that vary with time and location, and potential bottlenecks in the uplink direction due to the improved data transmission rate in small cell networks (SCNs) [7][8][9].
Another approach to tackling the data explosion is to implement caching in the network (caching in the air). Significant research has focused on distributed caching in cellular networks such as SCNs, both in the context of downlink caching [10,11] and in the context of uplink caching [12][13][14][15][16], which is the primary focus of this paper.
For downlink transmission, the authors in [10] presented "eNCache", a cooperative caching method in which neighboring routers cooperate in content delivery using the on-path caching concept. eNCache decides at every router how that router interacts with its neighbors by extending the Named Data Networking Interest packet structure, which improved content delivery as well as the cache hit ratio and diversity. The authors in [11] proposed a cooperative caching strategy based on Dijkstra's algorithm for a UAV-assisted edge computing system. It improves the quality of service (QoS) by computing the content transmission delay among neighboring nodes and thereby reducing the content delivery delay. In addition, popular contents are stored in the caches of the small BSs to provide wireless data transmission services, which reduced the traffic load of the network and the content transmission delay.
In the uplink direction, the authors in [12] proposed a novel upload cache architecture that supports parallel uploading of segmented files. Meanwhile, the researchers in [13] introduced an uplink cache system for delay-tolerant SCNs and analyzed the effect of cache size on uplink caching. The authors suggested eliminating duplicated contents at the Small Base Station (SBS) by matching the hash keys of file chunks after the actual content has been uploaded. However, this approach was deemed impractical for the uplink scenario.
In [14], the authors proposed a novel multiple-input multiple-output (MIMO) network architecture that incorporated a large number of Base Stations (BSs) to facilitate cache-enabled uplink transmission. One of the key contributions of that study was the introduction of the modified von Mises distribution as a popularity distribution function. By utilizing this distribution, they were able to derive the outage probability and establish a direct correlation between cache storage and outage probability. Furthermore, their observations revealed that increasing the cache storage space and network density resulted in an enhanced delivery rate.
While the papers [12][13][14] proposed innovative uplink cache architectures and schemes to enhance network performance, none of them considered the impact of caching on both energy efficiency (EE) and spectrum efficiency (SE), nor did they consider distributed and cooperative caching. Moreover, the existing literature used SBSs [13] and BSs [14] for content matching and elimination of duplicated contents, assuming that the Mobile Stations (MSs) would be unaware of cached contents. This could result in unnecessary content uploads, which is undesirable. In contrast, the authors in [15] introduced the Broadcast Cache Assist Uplink (BCAU) scheme, which matches the attributes of cached contents against incoming content at the MS level. As a result, redundant uploading of content that is already available in the SBS cache is avoided before the actual content transmission. This approach significantly improved the EE and throughput of uplink transmission over B5G-SCN.
In [16], the authors presented a framework for the cooperative distribution of the SBS caches. This framework creates a contents list from all the SBSs and the Macro Base Station (MBS) in the coverage area to assist the uplink transmission by eliminating the duplicated contents among them. Additionally, content matching at an MS was emphasized, which effectively improved energy and spectrum efficiency. Furthermore, incoming contents of large size were split, and their fractions were cached in the distributed cache, which improved the cache hit ratio. However, the size of existing contents placed in the distributed cache and their effect on the distributed cache's efficiency were not discussed.
Based on the previous discussion, the size of cached contents significantly degrades the performance of the distributed cache and the cooperation among its caches. Contents of large or medium size waste more space and decrease the effectiveness of the cache until they expire, which is not desirable. On the other hand, each cache also requires enough free space for caching new contents or their segments, which can be achieved by applying the Cache Replacement Policy (CRP) to evict some of the cached contents [17]. However, this results in more energy consumption and delay because the MS's waiting time and connection time increase. It also wastes storage and delays content delivery.
These challenges motivate the novel schemes put forward in this paper for distributed cache-enabled uplink transmission. The main aim of this paper is to enhance the efficiency of the overall distributed cache by increasing the cache hit ratio and cache hit probability, while simultaneously reducing the energy consumption and delays caused by cache replacement policies. Additionally, we aim to minimize wasted cache storage space and allocate suitable free space in each cache for caching new contents, ultimately improving the distributed cache's performance. This is achieved by two proposed schemes, namely, the Proposed Distributed Uplink Cache Scheme and Re-Distribute Un-Duplicated Cached Content (RUCC). The first scheme addresses the challenge of eliminating duplicated distributed cached contents and then generates the list of un-duplicated distributed cached contents [16]. The second scheme addresses the challenge of segmenting contents of large or medium size into smaller pieces and redistributes the un-duplicated distributed cached contents among the distributed cache. This improves the performance and the cache hit ratio, in addition to making the best use of available space by caching the most popular contents in a distributed manner. The main contributions of this paper build on [16] and are summarized as follows:
• Based on matching and finding the similarity among distributed cached contents [16], a list of the un-duplicated cached contents of the distributed cache is generated and broadcast to all MSs, which use it as a map to decide whether or not to upload their content(s).
• The un-duplicated distributed cached contents of large or medium size are segmented into many smaller segments.
• The smaller segments of the un-duplicated cached contents are redistributed among the distributed cache based on the free space of each cache and the size of the un-duplicated cached contents.
• The efficiency of the distributed cache under the proposed schemes is evaluated in terms of the uplink cache hit ratio, cache hit probability, overall cache efficiency, and diversity. In addition, the throughput, SE, and EE of the access link are evaluated. Besides that, the merits of the proposed design are analyzed mathematically.
The remainder of this paper is organized as follows. The system model is discussed in Section II. Section III presents the proposed schemes for the distributed uplink cache. Experimental design and evaluation are discussed in Section IV, numerical results in Section V, and the conclusion follows in Section VI.

System model
In this section, the system model of the distributed cache is described in detail. It consists of Small Base Stations (SBSs) and a Macro Base Station (MBS) along with the Mobile Stations (MSs). Fig 1 illustrates the overall system architecture.
The notations used in this paper are shown in Table 1.

Cache model
The system incorporates a cooperative distributed cache of $M + 1$ caches located at the $M$ SBSs and an MBS. This distributed cache operates as a unified entity with a combined capacity of $\omega$ to store a total of $L$ contents. The capacity allocation for the distributed cache can be expressed as follows:

$$\omega = S(G^{\complement}) + \sum_{j=1}^{M} S(C_{B_j}), \quad (1)$$

where $S(G^{\complement})$ and $S(C_{B_j})$ represent the storage capacity of the cache of the MBS and of SBS $B_j$, respectively. Let $\mathcal{T} = \{1, 2, \ldots, T\}$ be the set of time slots, where $t \in \mathcal{T}$ denotes the time interval in which a set of MSs upload contents to the target SBSs. Each time slot consists of many event points, and $U^{t}_{j,i}$ is the connection time of an MS $U_{j,i}$.
2.2.1 Cache storage. Let $\complement = \{C_{B_j} : j = 1, 2, \ldots, M\}$ be the list of the distributed caches of the SBSs.
The set $D$ represents the popular contents stored in $C_{B_j}$, such that $D = \{D_l : l = 1, 2, \ldots, \omega : 1 \le \omega < L\}$. Here, the value of $\omega$ represents the total count of contents cached in $C_{B_j}$.
The set of the cached contents of $C_{B_j}$ is denoted by $C_{B_j,d_l} = \{C_{B_j,d_1}, C_{B_j,d_2}, \ldots, C_{B_j,d_\omega} : 1 \le j \le M, 1 \le l \le \omega\}$, where the variable $l$ serves as the sequential identifier of the cached content within $C_{B_j}$.
Referring to [18] and utilizing the Zipf distribution, the relative popularity or frequency of occurrence of each cached content item $C_{B_j,d_l}$ is denoted by $Popu_{C_{B_j,d_l}}$ and can be represented as

$$Popu_{C_{B_j,d_l}} = \frac{r_{C_{B_j,d_l}}^{-\delta}}{\sum_{n=1}^{\omega} n^{-\delta}}, \quad (2)$$

where the popularity rank of $C_{B_j,d_l}$ is indicated by $r_{C_{B_j,d_l}}$ and the value of the exponent is denoted by $\delta$. Moreover, each $C_{B_j,d_l}$ is associated with a set of attributes such as hash key, name, length, size, and other relevant properties. These attributes are collectively represented by $P(C_{B_j,d_l,\kappa})$, which is used to perform the matching among distributed cached contents in order to determine and eliminate duplicate content. The size of a content is used to segment it into smaller pieces for redistribution among the distributed cache. The collection of sizes of the cached contents within $C_{B_j}$ is represented by $S(D) = \{S(C_{B_j,d_l}) : l = 1, 2, \ldots, \omega\}$. Using [18][19][20][21], a segmented cached content approach is considered in which each content is divided into $Q$ segments. The size of each segment is determined based on two factors:
• The size of the cached content, which is denoted as $S(C_{B_j,d_l})$.
• The available free space in the cache of the local Small Base Station (SBS) as well as its neighboring SBSs.
The free space of the cache of an SBS $B_j$ is indicated by $C^{fS}_{B_j}$, and it can be calculated as follows:

$$C^{fS}_{B_j} = S(C_{B_j}) - \sum_{l=1}^{\omega} S(C_{B_j,d_l}). \quad (3)$$
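As an illustration of the cache storage model, the following minimal Python sketch computes Zipf popularity values and the free space of a cache. The function names and the interpretation of free space as capacity minus the total size of cached contents are assumptions made for this example, not part of the original formulation.

```python
def zipf_popularity(num_contents, delta):
    """Relative popularity of ranks 1..num_contents under a Zipf law.

    delta plays the role of the Zipf exponent; the result sums to 1.
    """
    weights = [rank ** -delta for rank in range(1, num_contents + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def free_space(capacity, cached_sizes):
    """Assumed free space: cache capacity minus the sizes of cached contents."""
    return capacity - sum(cached_sizes)
```

For example, `zipf_popularity(4, 1.0)` assigns the highest value to rank 1, and a larger `delta` concentrates popularity further on the top-ranked contents.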

Efficiency of a cache.
The efficiency of a cache can be assessed by analyzing several key metrics. These include the cache hit ratio, which measures the proportion of requested content that is successfully retrieved from the cache, and the cache miss ratio, which indicates the fraction of requested content that is not found in the cache and must be retrieved from the original source.
According to [22,23], the cache hit ratio and cache miss ratio of each SBS cache $C_{B_j}$ are denoted by $Hit^{r}_{C_{B_j}}$ and $Miss^{r}_{C_{B_j}}$, respectively. Referring to the work in [24], the cache hit probability is the likelihood that a randomly selected active MS finds the content it is uploading in the distributed cache. It is denoted by $hit_{prob}$ and is given by Eq (6) in terms of $C^{prob}_{d_l}$, the caching probability of content $C_{B_j,d_l}$, and $prob^{r}_{d_l}$, the probability of content $C_{B_j,d_l}$ according to [15].
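These metrics can be sketched in a few lines of Python. The sketch assumes the common definitions, namely hit ratio = hits / (hits + misses) and hit probability = the sum over contents of request probability times caching probability. The function names are hypothetical, and the exact forms used in [22-24] may differ.

```python
def hit_miss_ratio(hits, misses):
    """Return (hit ratio, miss ratio) for one cache; (0, 0) if no requests."""
    total = hits + misses
    if total == 0:
        return 0.0, 0.0
    return hits / total, misses / total


def hit_probability(caching_prob, request_prob):
    """Assumed form: sum over contents of request prob. times caching prob."""
    return sum(c * p for c, p in zip(caching_prob, request_prob))
```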

Communication model
In this subsection, the uplink transmission of the MSs is presented in detail. Each MS is served by its associated SBS based on their respective locations. The MS establishes a connection with its local SBS to initiate the upload of its content. The local SBS is selected by evaluating the Maximum Averaged Receive-Signal-Strength (Max-RSS). The Max-RSS is calculated considering factors such as association probability, handoff probability, and coverage probability, which are determined using a multi-directional path loss model and the K-means algorithm. The coverage probability is evaluated using schemes such as Maximum Received Power Association (MRPA) and Nearest BS Association (NBA) [25][26][27].
Referring to the works of [16,28,29], the capacity of the uplink transmission of an MS is denoted by $R^{ul}_{U_{j,i}}$ and can be described as follows:

$$R^{ul}_{U_{j,i}} = B \log_2\left(1 + SINR_{U_{j,i}}\right).$$

Here, the channel bandwidth is indicated by $B$. The Signal-to-Interference-and-Noise Ratio (SINR) of the signal received from the mobile station $U_{j,i}$ at its serving SBS $B_j$ is denoted as $SINR_{U_{j,i}}$. It can be mathematically represented as follows:

$$SINR_{U_{j,i}} = \frac{TP^{ul}_{U_{j,i}} H^{ul}_{i,j} \|d(U_{j,i}, B_j)\|^{-\alpha}}{\sigma^2 + \sum_{k \in I_i} TP^{ul}_{U_{j,k}} H^{ul}_{k,j} \|d(U_{j,k}, B_j)\|^{-\alpha}}.$$

Here, the uplink transmit power of MS $U_{j,i}$ is indicated by $TP^{ul}_{U_{j,i}}$. The term $H^{ul}_{i,j}$ corresponds to the uplink channel gain. The Euclidean norm is represented by $\|\cdot\|$. The variable $d(U_{j,i}, B_j)$ signifies the distance between mobile station $U_{j,i}$ and small base station $B_j$. The path loss exponent is denoted by $\alpha$. The noise power spectral density at the MS is represented by $\sigma^2$. Lastly, $I_i$ denotes the set of interfering MSs that are served by the same SBS $B_j$, and $i$, $j$ index the mobile station and its serving SBS, respectively.
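The uplink rate model can be illustrated with a short Python sketch. It assumes the standard Shannon form, capacity = B log2(1 + SINR), with a distance-based path loss. The function and parameter names are hypothetical, and interference is passed in as a list of already-received interfering powers.

```python
import math


def sinr(tx_power, channel_gain, distance, alpha, noise_power, interference=()):
    """Received signal power over noise plus summed interference power."""
    signal = tx_power * channel_gain * distance ** -alpha
    return signal / (noise_power + sum(interference))


def uplink_capacity(bandwidth_hz, sinr_value):
    """Shannon capacity in bit/s for a given bandwidth and linear SINR."""
    return bandwidth_hz * math.log2(1.0 + sinr_value)
```

Under this model, doubling the distance with a path loss exponent of 3 reduces the received signal power by a factor of 8, which lowers the achievable uplink rate accordingly.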

Energy consumption (EC) model
Based on [30] and referring to the analysis of the previous work in [15], the energy consumption of the MS $U_{i,j}$ is denoted by $EC_{U_{i,j}}$ and can be expressed as

$$EC_{U_{i,j}} = E_m + E_{op} + TE^{ul}_{U_{j,i}} + E_r, \quad (9)$$

where $E_m$, $E_{op}$, $TE^{ul}_{U_{j,i}}$, and $E_r$ are the energy consumed for executing the matching, for other operations, for transmitting, and for receiving, respectively.
$E_r$ takes the value of the energy cost of receiving the packet(s) from the serving SBS $B_j$, while $TE^{ul}_{U_{j,i}}$ is calculated as follows. If there is a cache hit, $TE^{ul}_{U_{j,i}}$ is calculated only for transmitting the Message of Target Destination (MoTD), which contains a set of attributes of the content, such as the hash key, and the target destination of the upload at the target SBS ($B_j$). Otherwise, $TE^{ul}_{U_{j,i}}$ is calculated for transmitting the whole content. The average energy consumption of all mobiles is denoted by $Avg^{EC}_{U}$ and is calculated by dividing the sum of the energy consumption of the mobiles according to (9) by the total number of mobiles $N$:

$$Avg^{EC}_{U} = \frac{1}{N} \sum_{i=1}^{N} EC_{U_{i,j}}.$$
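The energy model above can be sketched as follows. The linear energy-per-bit transmission cost and all names are assumptions introduced for illustration; the source only states that a cache hit transmits the MoTD instead of the whole content.

```python
def transmit_energy(cache_hit, motd_bits, content_bits, energy_per_bit):
    """On a hit only the MoTD is sent; on a miss the whole content is sent."""
    bits = motd_bits if cache_hit else content_bits
    return bits * energy_per_bit  # assumed linear cost model


def ms_energy(e_match, e_other, e_tx, e_rx):
    """Total MS energy: matching + other operations + transmit + receive."""
    return e_match + e_other + e_tx + e_rx


def average_energy(per_ms_energy):
    """Average energy consumption over all N mobile stations."""
    return sum(per_ms_energy) / len(per_ms_energy)
```

With a small MoTD relative to the content size, the transmit term for a hit is a small fraction of the term for a miss, which is where the energy saving of MS-level matching comes from.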

Proposed scheme for Distributed Uplink Cache
The distributed cache faces three key challenges: content duplication among caches, the MSs' lack of knowledge about cached contents, and the need to segment large or medium-sized cached contents for distributed storage. To overcome these challenges and enhance the MS's quality of experience, two schemes for distributed uplink caching are proposed, namely, the Distributed Uplink Cache Scheme (DulC) and the Re-Distribute Un-Duplicated Cached Content (RUCC) scheme. DulC performs the matching among the distributed cached contents, eliminates the duplicates among them, and then generates the Un-Duplicated Content List (UDCL) to be used by the second scheme. This scheme, presented in Algorithm 1, utilizes the Adaptive Content Validity Period (AcVP) to validate duplicate cached contents. It is assisted by the Broadcast Cache Assist Uplink (BCAU) algorithm in [15] (previous work), which matches an incoming content against the list of the distributed cached contents at the MS level so that only dissimilar content is uploaded. Furthermore, to address the challenge of content segmentation and improve the performance and cache hit ratio, the Re-Distribute Un-Duplicated Cached Content (RUCC) scheme is proposed. This scheme, presented in Algorithm 2, segments the cached contents into smaller pieces and redistributes them for efficient distributed storage.

Proposed Distributed Uplink Cache Scheme (DulC)
The proposed scheme consists of the following steps:
3.1.1 Generating a list of distributed cached contents. Ensuring that an MS possesses a comprehensive list of cached contents is crucial to avoid nonessential uploads. This subsection provides a detailed explanation of the approach used to generate the contents list. The MBS plays a significant role in this process by creating two distinct lists: the Un-Duplicated Content List (UDCL) and the Duplicated Content List (DCL). The UDCL is primarily designed for content matching purposes, while the DCL serves the purpose of identifying and eliminating identical contents.

Processing of Contents at MBS and SBSs.
During the preprocessing stage, a comprehensive list of the contents present in the distributed cache is generated at both the MBS and SBSs within the network.This step involves compiling all the cached contents from the various cache locations.
The list of contents of each distributed cache at the MBS/SBSs is denoted by $C^{\omega \times \kappa}_{B_j/G^{\complement}}$ and is given in Eq (11). It has $\omega$ rows, where $\omega$ is the total number of cached contents of each cache, and $\kappa$ columns, where $\kappa$ is the maximum number of attributes of each content. The index $j$ counts over both the MBS and the SBSs. A cached content is denoted by $d_i$ and its properties are indicated by $P^{d_i,\kappa}_{j}$, where $j$, $i$, and $\kappa$ represent the serial number of the MBS/SBS, of the cached content, and of the attribute of each cached content, respectively. With $|C_{B_j}| = M$, Eq (12) is applicable to all the SBSs $B_j$, $j = 1, 2, \ldots, M$.
3.1.2.1 Consolidated list of distributed cached contents. The MBS collects the lists of distributed cached contents as shown in (11). Then, the MBS applies a row-wise combination function to generate a consolidated list $C^{L_d \times \kappa}_{CoD_d}$, given in Eq (13), where the total number of contents of the distributed cache is denoted by $L_d$.
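The row-wise combination can be illustrated with a minimal Python sketch. The data layout (a dictionary from cache identifier to a list of attribute dictionaries) and the function name are assumptions for illustration only.

```python
def consolidate(cache_lists):
    """Stack the per-cache content lists row-wise into one consolidated list.

    cache_lists maps a cache id (MBS or SBS) to a list of attribute dicts.
    Each output row keeps the originating cache id alongside the attributes.
    """
    consolidated = []
    for cache_id, contents in cache_lists.items():
        for attrs in contents:
            consolidated.append({"cache": cache_id, **attrs})
    return consolidated
```

The number of rows of the result equals the total number of cached contents across all caches, which corresponds to $L_d$ in the notation above.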

Filtering similar contents.
To eliminate duplication, the MBS matches the attributes of the contents in the consolidated list (13) using the dissimilarity function (14). This function computes the dissimilarity between a typical content ($y$) and a target content ($y^{\circ}$) [15,31]. The dissimilarity function, denoted as $dissim(y, y^{\circ})$, measures the difference between the typical content $y$ and the target content $y^{\circ}$. The attributes of the content are represented by $(P_d)$, with $\kappa$ being the maximum number of attributes. The contribution of feature $(P_d)$ to the dissimilarity between $y$ and $y^{\circ}$ is denoted as $Dis^{(P_d)}_{y,y^{\circ}}$, while $\partial^{(P_d)}_{y,y^{\circ}}$ is the indicator. The similarity of the contents $(y, y^{\circ})$ is denoted by $sim(y, y^{\circ})$.
The list of similarity values is denoted by $SIM^{L_d \times \kappa}$. A matching threshold $sim^{sh}_{th}$ is then used to select the similar contents:

$$sim = \begin{cases} \text{Similar}, & sim(y, y^{\circ}) \ge sim^{sh}_{th} \\ \text{Dissimilar}, & sim(y, y^{\circ}) < sim^{sh}_{th} \end{cases}$$

By employing this approach, it becomes possible to identify duplicate contents present in various caches, which can then be promptly removed or evicted only from the caches of the SBSs. This ensures that there is sufficient space available to accommodate new and unique contents, optimizing the overall efficiency and performance of the distributed caching system.
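A minimal sketch of attribute-based matching is given below. It assumes a Gower-style dissimilarity (the indicator is 1 when both contents carry the attribute, and the contribution is 0 for equal values and 1 otherwise) and takes similarity as one minus dissimilarity. These choices and all names are assumptions for illustration; the function in [15,31] may treat numeric attributes differently.

```python
def dissimilarity(y, y_target, attributes):
    """Indicator-weighted average of per-attribute mismatches."""
    numerator = 0.0
    denominator = 0.0
    for attr in attributes:
        if attr in y and attr in y_target:  # indicator = 1
            denominator += 1.0
            numerator += 0.0 if y[attr] == y_target[attr] else 1.0
    # With no comparable attribute, treat the pair as fully dissimilar.
    return numerator / denominator if denominator else 1.0


def is_similar(y, y_target, attributes, threshold):
    """Apply the matching threshold to the (assumed) similarity 1 - dissim."""
    return 1.0 - dissimilarity(y, y_target, attributes) >= threshold
```

Two contents that share a hash key and name but differ in size would, under these assumptions, have a similarity of 2/3 over three attributes and be declared similar for any threshold up to that value.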
3.1.3 Validity of cached contents at MBS and SBSs. Once similar contents have been identified, it is crucial to assess the validity of the existing cached contents stored in the distributed cache. It is necessary to ensure that duplicate contents are retained in only one cache, while the rest are evicted.
For duplicates between the MBS and the SBSs, all duplicate contents can be stored at the MBS since it has a larger cache size. However, when dealing with duplicate contents among the SBS caches, it becomes important to evaluate their validity and determine which cache should retain the content, while the others evict it.
To address this, an Adaptive Content Validity Period (AcVP) function is proposed. The AcVP function calculates the probability of whether to keep or evict a content from any of the SBS caches. This function assists in making informed decisions about content placement within the distributed cache system.
The AcVP value of a cached content is denoted by $\dot{C}_{B_j,d_l}$. It is calculated from the following quantities: $RE_{B_j}$ represents the remaining energy of an SBS $B_j$; $C^{fS}_{B_j}$ denotes the available cache space of an SBS, which is determined using Eq (3); $U_{j,i}$ corresponds to the total number of MSs served by an SBS $B_j$; the popularity of the cached content is determined by Eq (2); and $T^{R}_{C_{B_j,d_l}}$ signifies the remaining time for a content to stay in the cache, which is calculated from the content's expiry time and the time of the last hit of that content, $t_{C_{B_j,d_l}}$.
After the content similarity has been computed, the AcVP is evaluated for the duplicate contents. The duplicate content with the maximum value of $\dot{C}_{B_j,d_l}$ is kept in its cache and, in addition, is added to the Un-Duplicated Content List (UDCL) as an un-duplicated content. The UDCL is given formally in (20), where the $L^{\circ}$ rows represent the total number of un-duplicated contents of all distributed caches, which is computed as a counter while the list is created during the execution of Algorithm 1.
The remaining copies are added to the Duplicate Content List (DCL) and then removed/evicted from their SBS caches. In the DCL, $\bar{L}$ is the total number of duplicate contents across the SBS caches to be eliminated from their caches.
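The keep-or-evict decision can be sketched as follows. The AcVP values are taken as given inputs, since this sketch does not attempt to reproduce the AcVP formula itself; the function name and tuple layout are hypothetical.

```python
def split_duplicates(copies):
    """Partition the copies of one duplicated content by AcVP value.

    copies is a list of (cache_id, acvp_value) pairs. The copy with the
    maximum AcVP is kept (UDCL entry); the others are marked for eviction
    (DCL entries).
    """
    keep = max(copies, key=lambda copy: copy[1])
    evict = [copy for copy in copies if copy is not keep]
    return keep, evict
```

For instance, given copies at three SBSs with AcVP values 0.2, 0.9, and 0.5, the copy with 0.9 stays in its cache and the other two go to the DCL.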
3.1.4 The final list generation. Once the validity of the duplicate contents has been determined, the final lists of duplicated and un-duplicated contents are generated. These lists are represented by Eqs (21) and (20), respectively. The Un-Duplicated Content List (UDCL) serves as a useful reference for an MS, enabling it to make an informed decision on whether to upload the content or to send an MoTD instead of uploading the real content.
3.1.5 Duplication elimination in the cache. The UDCL and the DCL, which is built using AcVP as depicted in (21), are shared with all the SBSs. The contents identified with low AcVP in the DCL are considered for eviction from their respective caches, denoted by $victim(\dot{C}_{B_j,d_l}, C_{B_j}, \bar{T})$, where $\bar{T}$ denotes the time at which the content is evicted. Moreover, each SBS $B_j$ broadcasts the UDCL to all the MSs it serves, enabling content matching accordingly. Because the distributed cached contents are continuously updated and their popularity varies due to actions such as views, uploads, shares, and downloads, the UDCL is periodically updated and rebroadcast to all the MSs to maintain content consistency across the distributed cache and the MSs.

Proposed Distributed Uplink Cache Scheme (DulC).
Based on this discussion, the DulC is proposed as shown in Algorithm 1. Each of the previous subsections from (3.1.1) to (3.1.5) corresponds to a step of the proposed algorithm, so the DulC algorithm consists of 5 major steps. Step 1 involves the processing of the distributed cached contents at the MBS. In Step 2, similarity checks are performed to identify duplicate contents. Step 3 focuses on checking the validity of the content, resulting in the generation of two lists: UDCL and DCL. Step 4 encompasses the removal of duplicate contents from the target SBSs and the dissemination of the UDCL to all the MSs. Finally, in Step 5, the MSs utilize the un-duplicated list for content matching.

Complexity of the proposed algorithm.
As discussed earlier, the total number of distributed caches is represented by $M^{\circ}$, where each cache stores $\omega$ contents, resulting in a total of $L$ contents. Each content is characterized by $\kappa$ attributes. The un-duplicated and duplicated contents among the distributed caches are denoted as $L^{\circ}$ and $\bar{L}$, respectively.
Given these parameters, the complexity of the main operations in the iterative process can be determined as follows:
• The processing of cached contents at the distributed caches has a time complexity of $O(M^{\circ} \cdot \omega)$.
• Filtering similar contents by performing attribute matching among the contents requires a complexity of $O(L^2)$.
• Eliminating duplicate contents from the target distributed cache has a time complexity of $O(\bar{L})$.
Therefore, the overall time complexity of the algorithm can be expressed as $O(M^{\circ} \cdot \omega + L^2 + \bar{L})$.

Re-Distribute Un-Duplicated Cached Content (RUCC)
Based on the previous discussion, the duplicate cached contents have been eliminated from the target distributed cache. As a result, some caches have more free space and others less, depending on the total size of the cached contents eliminated from each cache. The objective is to obtain appropriate free space in each cache for caching new contents, and to avoid repeated execution of the cache replacement policy (CRP), which may evict some of the popular contents from that cache and thereby increase the energy consumption and the content delivery delay. To this end, un-duplicated distributed cached contents of medium and large size need to be segmented into smaller pieces, which are then redistributed and stored across the distributed cache.
3.2.1 Proposed RUCC scheme. To do that, the Re-Distribute Un-Duplicated Cached Content (RUCC) scheme is proposed as shown in Algorithm 2. The RUCC is performed in 4 major steps. Step 1 determines the target cached contents, whose size $S(D)$ is greater than or equal to the content verification size $V_{S(D)}$ relative to the total cache size. Step 2 determines the target caches, whose free space is greater than or equal to the cache verification size $V_{C_{B_j}}$ relative to the total cache size.
Step 3 segments each content into smaller pieces based on the available free space in the local SBS and the corresponding SBSs. Finally, Step 4 redistributes the segments and stores them across the caches.
Algorithm 2 Re-Distribute Un-Duplicated Cached Content (RUCC)
[… $B_j$]: List of the values of the free space of the SBS caches.
Step 1: Determine Target Cached Contents. $S = 0$: counter of the target cached contents; $V_{S(D)}$: define the value of the content verification size.
Step 3: Segment Each Target Content into Smaller Sizes. Determine the number of segments according to the […]
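The four RUCC steps can be illustrated with a simplified Python sketch. It is not Algorithm 2 itself: the greedy largest-free-space-first placement, the absolute thresholds, the assumption that a segmented content's original copy is released, and all names are illustrative assumptions.

```python
def rucc(contents, free_space, content_threshold, cache_threshold):
    """Greedy sketch of RUCC.

    contents: content id -> size; free_space: cache id -> free space.
    Returns content id -> list of (cache id, segment size).
    """
    # Step 1: target contents whose size reaches the content threshold.
    targets = {c: s for c, s in contents.items() if s >= content_threshold}
    # Step 2: target caches whose free space reaches the cache threshold.
    caches = {b: f for b, f in free_space.items() if f >= cache_threshold}
    placement = {}
    for content_id, size in targets.items():
        remaining = size
        segments = []
        # Step 3: cut segments to fit the free space of each target cache.
        for cache_id in sorted(caches, key=caches.get, reverse=True):
            if remaining == 0:
                break
            part = min(remaining, caches[cache_id])
            if part > 0:
                segments.append((cache_id, part))
                caches[cache_id] -= part
                remaining -= part
        if remaining == 0:
            # Step 4: store the segments across the target caches.
            placement[content_id] = segments
        else:
            # Not enough free space: undo the tentative reservations.
            for cache_id, part in segments:
                caches[cache_id] += part
    return placement
```

In this sketch, the number of segments of a content is simply the number of caches needed to hold it, which is one possible reading of segmenting by available free space.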

Complexity analysis of the RUCC schemes.
As discussed in (3.2) and Algorithm 2, the number of distributed caches is represented by $M$ (only the caches of the SBSs). The number of un-duplicated contents that may be segmented is $L^{\circ}$, where each cache holds $\omega^{\circ}$ un-duplicated contents. The total number of un-duplicated contents to be segmented is $S$, where each content is fragmented into $Q$ smaller segments to be cached at $Z$ caches out of $M$, with $Z \le M$. The complexity is counted as follows. The iteration that determines the target contents has time complexity $O(M \cdot \omega^{\circ})$. The iteration that determines the target caches has time complexity $O(M)$. The iteration that segments the target contents and caches their segments in the target caches has time complexity $O(S \cdot Z)$. Therefore, the overall time complexity is $O(M \cdot \omega^{\circ} + M + S \cdot Z)$.
[…] (referring to (6)). Additionally, the performance of the distributed cache is evaluated using the following metrics:
Cache diversity. To evaluate the number of distinct cached contents and account for the duplicate contents in the distributed cache, the cache diversity is used. It is the ratio between the cardinality of the unique contents cached in the distributed cache, $Card(\bigcup_{j=1}^{M} D_{l,j})$, and the cardinality of the total number of contents in the distributed cache, $Card(D_{l,j})$. It takes a value in $[\frac{1}{|M^{\circ}|}, 1]$. The cache diversity is denoted by $C_{Diversity}$ and is given by [35,36], where $M^{\circ} = M + 1$ is the total number of distributed caches in the distributed network, $l$ is the serial number of the distributed cached content, and $D$ is the distributed cached content.
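The diversity metric can be sketched in Python as the number of distinct contents divided by the total number of cached copies. The function name and the representation of caches as lists of content identifiers are assumptions for illustration.

```python
def cache_diversity(caches):
    """Ratio of distinct cached contents to all cached copies.

    caches is a list of per-cache lists of content identifiers.
    """
    all_items = [item for cache in caches for item in cache]
    if not all_items:
        return 0.0
    return len(set(all_items)) / len(all_items)
```

When every cache holds the same single content, the value falls to one over the number of caches, and it reaches 1 when no content is duplicated, which matches the range stated above.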
Overall Cache Efficiency (OCE).The Overall Cache Efficiency (OCE) is a metric that quantifies the ratio of the cumulative cache hits to the cumulative demands.It provides an indication of the overall effectiveness of the cache in serving requested content.The OCE, denoted by C OCE , and is calculated based on the formula provided in reference [37]. where is the cumulative of the distributed cache hit ratio.S d is cumulative of the demands.

Numerical results
The evaluation of the performance of both the distributed cache of the SBSs and MSs is presented in this section.

Numerical results of distributed cache performance
This subsection evaluates the impact of segmenting and redistributing cached contents on the distributed cache. Because the Small Base Stations (SBSs) do not cooperate, the average cache hit ratio of SBS-CoDc shows only a slight increase compared to Each-Cache. In Each-Cache, the cache hit ratio is calculated individually for each cache and then summed. However, the proposed scheme surpasses SBS-CoDc by achieving a remarkable 29.17% improvement in the cache hit ratio. This improvement is attributed to the distribution of contents among different SBSs, coupled with the use of the Unified Caching Locator of the UDCL, which acts as a map for Mobile Stations (MSs) to locate the cached contents. These enhancements significantly reduce the traffic load on the access network and backhaul links, resulting in improved EE and SE along with an increased cache hit ratio of the distributed cache.
5.1.2 Distributed cache hit probability. According to (6), the distributed cache hit probability is evaluated based on the content popularity and the content caching probability. […] This increased the indexing of the popular contents. In addition, contents with a high popularity rank increase the probability of finding the mobile's request in any of the distributed caches, which increases the cache hit probability. The curves of the proposed scheme are not stable because the popularity value is calculated on average over the total number of contents of the distributed cache, which have either the same popularity or a popularity in the range (80-85).

Overall Distributed Cache Efficiency (OCE).
According to (28), Fig 4 shows the OCE of the distributed cache under the proposed scheme, Each-Cache, and SBS-CoDc.
It can be seen that the OCE of the proposed scheme rises the fastest among the compared schemes. Until t = 800, the distributed caching entity received 517 requests on average, and the OCE of the proposed scheme, Each-Cache, and SBS-CoDc is 65, 102, and 98, respectively. A higher OCE implies that MSs can acquire more of their preferred content from the distributed cache. As a result, the successful distributed cache hits are 450, 800, and 950 for Each-Cache, SBS-CoDc, and the proposed scheme, respectively. More specifically, the proposed scheme achieves an OCE more than 19% and 16% higher than Each-Cache and SBS-CoDc, respectively. The reason is that the distributed cache setting provides an optimal cache, whereas a limited cache cannot store all cached contents of large or medium size; in addition, the most popular contents are stored distributively. The proposed scheme therefore needs some time slots to collect enough decisions from each MS to infer the MS's dependency on the target distributed cache based on the availability of contents. Furthermore, the OCE of the proposed scheme increases as the number of time slots t increases.

Distributed cache diversity.
To evaluate the elimination of duplicate contents among the distributed caches, the cache diversity is presented to show the similar contents among them. The proposed scheme shows a high cache diversity. This is because matching and similarity determination are performed among the contents of all the distributed caches, which reveals the replicas of the same content that belong to different caches. Furthermore, the diversity increases as the number of cached contents increases, because different MSs upload or request different contents across the network. The diversity of the proposed scheme can significantly reduce the expected run time of searching for contents within the distributed cache, and it optimizes how contents are distributed among the caches. This brings benefits for the MSs, such as more free space in the distributed cache and a lower transmission cost, and makes the distributed cache more efficient.
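The relation between duplicate elimination and diversity can be sketched as follows. The diversity measure used here, the fraction of cached copies that are distinct, is an assumption made for illustration and may differ from the metric plotted in the paper; the function names and cache contents are hypothetical.

```python
def cache_diversity(caches):
    """Fraction of cached copies that are distinct across all caches."""
    copies = [c for contents in caches.values() for c in contents]
    return len(set(copies)) / len(copies) if copies else 0.0

def remove_duplicates(caches):
    """Keep each content only in the first cache (in dict order) that holds it."""
    seen, deduplicated = set(), {}
    for sbs, contents in caches.items():
        deduplicated[sbs] = [c for c in contents if c not in seen]
        seen.update(contents)
    return deduplicated

caches = {"B1": ["a", "b"], "B2": ["b", "c"], "B3": ["a", "c"]}
before = cache_diversity(caches)
after = cache_diversity(remove_duplicates(caches))
```

Here three distinct contents occupy six cache slots before matching and three slots afterwards, so half of the occupied space is freed.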

Numerical results of mobiles performance
The performance of the MSs under the impact of the UDCL is evaluated in this subsection.

Improvement of Uplink Throughput of MSs.
The average uplink throughput (TH) of the proposed scheme is compared with that of the existing schemes in Fig 7. The proposed scheme improves the TH by almost 76.67%, 29.27%, and 17.78% compared to No-Cache, Each-Cache, and SBS-CoDc, respectively.
The proposed scheme achieves a significantly higher TH for the MSs than the No-Cache scheme, which is expected because it uses caching. It also outperforms Each-Cache because of its distributed nature. The results further indicate that the proposed scheme surpasses SBS-CoDc by segmenting and redistributing contents instead of storing the entire content in a cache. Additionally, unlike the existing schemes, the proposed approach performs content matching at the MS level rather than at the SBS level. This eliminates the need to upload content that is already cached and leads to a substantial enhancement in throughput.
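Matching at the MS level can be sketched as a lookup performed before each upload. This is a minimal illustration: the use of SHA-256 fingerprints as the matching key, the layout of the locator, and all names are assumptions of this sketch, not the matching procedure specified in the paper.

```python
import hashlib

def content_id(data: bytes) -> str:
    """Fingerprint used for matching (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(data).hexdigest()

def uploads_needed(pending, locator):
    """Return only the contents whose fingerprint is absent from the locator."""
    return [data for data in pending if content_id(data) not in locator]

# Hypothetical locator held by the MS: fingerprint -> SBS that caches the content.
locator = {content_id(b"video-1"): "B1", content_id(b"photo-7"): "B2"}
to_upload = uploads_needed([b"video-1", b"clip-9"], locator)
```

Only the content that misses the locator is sent over the uplink; the matched content consumes no uplink resources, which is the source of the throughput gain discussed above.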

Improved MS's energy efficiency.
Energy efficiency (EE) is a useful metric for assessing performance enhancements. As observed in previous studies, lower energy consumption typically correlates with improved EE. In this research, the proposed scheme is compared with existing schemes based on EE. Referring to Eq (23), the results in Fig 8 show that the average EE of the proposed scheme surpasses that of the existing schemes. Compared to SBS-CoDc, the proposed scheme improves EE by 78%. This enhancement can be attributed to the use of the Unified Caching Locator of the UDCL for content matching at the MS level, which reduces content uploads when a cache hit occurs. Furthermore, the proposed scheme enhances the average EE by 46% in contrast to SBS-CoDc. This improvement is primarily attributed to the higher hit ratio achieved through the segmentation and redistribution of contents, which eliminates the need to store the entire content in a cache.

The proposed scheme also improves Spectral Efficiency (SE) by approximately 18% compared to SBS-CoDc and by 29.27% compared to Each-Cache. The improved SE can be attributed to a significant reduction in the number of uplink contents. The decision made at the MS level on whether to upload content plays a crucial role: when a cache hit occurs, the content is not uploaded, which conserves bandwidth for requests from the remaining MSs. Consequently, a substantial amount of spectrum is saved, more requests can be accommodated, and the SE improves.

Conclusion
This paper presented efficient uplink cache schemes for a distributed scenario. The proposed schemes improved energy and spectral efficiency by leveraging content matching among the distributed caches, resulting in increased free space and a higher cache hit ratio. Local content matching reduced duplicate contents, while providing the MS with a list of cached contents improved the cache hit ratio. Consequently, the schemes enhanced the throughput, EE, and SE of the access network. In rare cases where the free space in a distributed cache is insufficient for new content, the distributed cache performed content replacement. Segmentation and redistribution of cached contents of large or medium size further improved the cache hit ratio, cache hit probability, and Overall Cache Efficiency (OCE). The analysis demonstrated that the proposed schemes outperformed existing schemes by increasing the cache hit ratio, cache hit probability, OCE, and diversity by 29.17%, 74.89%, 24.17%, and 80%, respectively. Moreover, the schemes enhanced the TH, SE, and EE of the access network by 17.78%, 18%, and 78%, respectively, and improved the EE and SE of the sidehaul and backhaul links of the SBSs.

Fig 6. Average diversity of the distributed cache. https://doi.org/10.1371/journal.pone.0299690.g006

Fig 8 illustrates the EE of an MS as the number of MSs increases.

5.2.3 Improvement in Spectral Efficiency (SE) of MSs. Referring to Eq (26), Fig 9 illustrates the average SE of MSs under the No-Cache, Each-Cache, SBS-CoDc, and proposed schemes for varying numbers of MSs.

Network model
The cellular network consists of a cloud, an MBS $G$, a total of $M$ SBSs such that $B = \{B_j;\ j = 1, 2, \ldots, M\}$, and $N$ MSs such that $U = \{U_i;\ i = 1, 2, \ldots, N\}$, as shown in Fig 1. Time Division Duplex (TDD) is used to schedule the resources of the MBS, SBSs, and MSs so that each of them can both transmit and receive. The system uses the MBS to gather information from all SBSs. The spatial distributions of the SBSs and MSs follow two separate homogeneous Poisson Point Processes (hPPP), denoted by $\Phi_B$ and $\Phi_U$, respectively. The SBSs are distributed with density $\lambda_B$, and the MSs are distributed with density $\lambda_U$. This independent and homogeneous distribution allows for a flexible and scalable deployment of the SBSs and MSs throughout the network. The set of MSs served by an SBS $B_j$ is denoted by $U_{j,i} = \{U_{j,i} : 1 \le i < n,\ n < N\}$.
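A deployment of this kind can be sampled with the sketch below, which draws two independent homogeneous PPPs on a square region. The region shape, its side length, the density values, and the function names are hypothetical choices made for illustration; the Poisson count is drawn with Knuth's multiplication method, which is adequate only for small means.

```python
import math
import random

def sample_poisson(mean, rng):
    """Draw a Poisson variate with Knuth's method (suitable for small means)."""
    threshold, count, product = math.exp(-mean), 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def sample_hppp(density, side, rng):
    """Homogeneous PPP on a side x side square: Poisson count, uniform positions."""
    n = sample_poisson(density * side * side, rng)
    return [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]

rng = random.Random(0)
sbs_points = sample_hppp(density=0.05, side=10.0, rng=rng)  # lambda_B, hypothetical value
ms_points = sample_hppp(density=0.5, side=10.0, rng=rng)    # lambda_U, hypothetical value
```

Sampling the two processes separately reflects their independence: the number and positions of the MSs do not depend on where the SBSs fall.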
j ;d l and its Attributes P B j ;d l ;k to C

end
Step 4: Redistribute and store the segments distributively.
then
  Local SBS sends $C_{B_j,d_s}$ or the set of segments to $B_j$;
  Receive $C_{B_j,d_s}$ or the set of segments in $B_j$;
  Cache $C_{B_j,d_s}$ or the set of segments in $B_j$;
end
Create $Seg^{d_l}_{Map}$;
end
Return $Seg_{Map} = \{Seg^{d_1}_{Map}, \ldots, Seg^{d_l}_{Map} : 1 \le l \le S\}$.
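The redistribution step can be sketched as a greedy placement that records where each segment is stored. This is an illustration only: the fixed segment size, the rule of placing each segment in the cache with the most free space, and all names are assumptions of this sketch and are not taken from the algorithm above, whose earlier steps are not shown in this section.

```python
def redistribute_segments(content_sizes, free_space, segment_size):
    """Split each content into segments and place each in the cache with most free space.

    Returns {content: [(segment_index, sbs), ...]}; a segment that fits nowhere is skipped.
    """
    free = dict(free_space)  # copy so the caller's dictionary is not mutated
    seg_map = {}
    for content, size in content_sizes.items():
        placements = []
        n_segments = -(-size // segment_size)  # ceiling division
        for index in range(n_segments):
            seg_size = min(segment_size, size - index * segment_size)
            target = max(free, key=free.get)  # cache with the most free space
            if free[target] < seg_size:
                continue
            free[target] -= seg_size
            placements.append((index, target))
        seg_map[content] = placements
    return seg_map

# Hypothetical content of size 5 and two caches with 4 and 3 units of free space.
seg_map = redistribute_segments({"d1": 5}, {"B1": 4, "B2": 3}, segment_size=2)
```

The returned dictionary plays the role of the segment map: it tells a requester which SBS holds each segment of a content, so a content larger than any single cache's free space can still be stored and located.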